Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions

Authors

  • Ning Ma
  • Guy J. Brown
  • Tobias May
Abstract

This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs), to the source azimuth. Our approach was evaluated using a localisation task in which sources were located in a full 360-degree azimuth range. As a result, front-back confusions often occurred due to the similarity of binaural features in the front and rear hemifields. To address this, a head movement strategy was incorporated in the DNN-based model to help reduce the front-back errors. Our experiments show that, compared to a system based on a Gaussian mixture model (GMM) classifier, the proposed DNN system substantially reduces localisation errors under challenging acoustic scenarios in which multiple speakers and room reverberation are present.
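As a rough illustration of the two binaural features named in the abstract, the following Python sketch computes a normalised cross-correlation function and an interaural level difference for a single frame of a two-channel signal. It is not the authors' implementation: the function name `binaural_features`, the lag range of ±1 ms, the broadband (unfiltered) processing, and the sign conventions are all assumptions made for this example. A classifier such as a DNN would take the resulting feature vector as input.

```python
import numpy as np


def binaural_features(left, right, fs, max_lag_s=1e-3):
    """Return (ccf, lags_s, ild_db) for one frame of a binaural signal.

    Hypothetical helper for illustration only. Assumes both channels
    have the same length and a lag range of +/- max_lag_s seconds.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    max_lag = int(round(max_lag_s * fs))

    # Full cross-correlation; output index n - 1 corresponds to zero lag.
    full = np.correlate(left, right, mode="full")
    e_left = np.sum(left ** 2)
    e_right = np.sum(right ** 2)
    norm = np.sqrt(e_left * e_right) + 1e-12

    centre = n - 1
    ccf = full[centre - max_lag:centre + max_lag + 1] / norm
    lags_s = np.arange(-max_lag, max_lag + 1) / fs

    # Level difference in dB; positive when the left channel is louder
    # (a sign convention assumed for this sketch).
    ild_db = 10.0 * np.log10((e_left + 1e-12) / (e_right + 1e-12))
    return ccf, lags_s, ild_db


# Usage: the right channel is a delayed, attenuated copy of the left.
rng = np.random.default_rng(0)
fs = 16000
left = rng.standard_normal(1024)
right = 0.5 * np.roll(left, 5)  # 5-sample delay, half the amplitude

ccf, lags_s, ild_db = binaural_features(left, right, fs)
features = np.concatenate([ccf, [ild_db]])  # one vector per frame
peak_lag_samples = int(round(lags_s[np.argmax(ccf)] * fs))
```

With `np.correlate`, the peak falls at a negative lag when the right channel lags the left, so the sign of the peak lag hints at the side of the source. Because a source in front and its mirror image behind the head produce nearly the same CCF and ILD, features of this kind alone cannot separate the two cases, which is the ambiguity the abstract addresses with head movements.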

Similar resources

Binaural deep neural network classification for reverberant speech segregation

While human listening is robust in complex auditory scenes, current speech segregation algorithms do not perform well in noisy and reverberant environments. This paper addresses the robustness in binaural speech segregation by employing binary classification based on deep neural networks (DNNs). We systematically examine DNN generalization to untrained configurations. Evaluations and comparison...

Binaural Reverberant Speech Separation Based on Deep Neural Networks

Supervised learning has exhibited great potential for speech separation in recent years. In this paper, we focus on separating target speech in reverberant conditions from binaural inputs using supervised learning. Specifically, a deep neural network (DNN) is constructed to map from both spectral and spatial features to a training target. For spectral feature extraction, we first convert binaura...

Exploiting top-down source models to improve binaural localisation of multiple sources in reverberant environments

Relatively few systems for machine hearing exploit top-down information in source localisation, despite there being clear evidence for top-down (e.g., attentional) effects in biological spatial hearing. This paper addresses this issue by proposing a framework for binaural sound localisation that exploits top-down knowledge about the source spectral characteristics in the acoustic scene. Informat...

Speech Localisation in a Multitalker Mixture by Humans and Machines

Speech localisation in multitalker mixtures is affected by the listener’s expectations about the spatial arrangement of the sound sources. This effect was investigated via experiments with human listeners and a machine system, in which the task was to localise a female-voice target among four spatially distributed male-voice maskers. Two configurations were used: either the masker locations wer...

On a Binaural Model with Front-back Discriminator using Artificial Neural Network trained by multiple HRTF catalogs

Various binaural models have been proposed for hearing assistance systems as well as humanoid robots, and the frequency domain binaural model (FDBM) is one of them. Like other binaural models, the original FDBM can separate and segregate a signal from a specific direction based on interaural information, but it works only in the frontal hemisphere due to front-back confusion. In ord...

Publication date: 2015